Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities
Recently, we proposed to transform the outputs of each hidden neuron in a
multi-layer perceptron network to have zero output and zero slope on average,
and use separate shortcut connections to model the linear dependencies instead.
We continue the work, first by introducing a third transformation to normalize
the scale of the outputs of each hidden neuron, and second by analyzing the
connections to second-order optimization methods. We show that the
transformations make a simple stochastic gradient behave closer to second-order
optimization methods and thus speed up learning. This is shown both in theory
and with experiments. The experiments on the third transformation show that
while it further increases the speed of learning, it can also hurt performance
by converging to a worse local optimum, where both the inputs and outputs of
many hidden neurons are close to zero. Comment: 10 pages, 5 figures, ICLR201
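The three transformations described above can be illustrated with a minimal NumPy sketch. This assumes a tanh nonlinearity and a Gaussian sample of one hidden unit's pre-activations; the variable names and the proportional-scaling choice are illustrative, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # sample of a hidden unit's pre-activations

f = np.tanh
beta = f(x).mean()                 # 1st transformation: zero output on average
alpha = (1.0 - f(x) ** 2).mean()   # 2nd: zero slope on average (d/dx tanh = 1 - tanh^2)
g_unscaled = f(x) - alpha * x - beta
gamma = 1.0 / g_unscaled.std()     # 3rd: normalize the output scale

g = gamma * g_unscaled  # transformed output: roughly zero mean, zero mean slope, unit scale
```

The linear part `alpha * x + beta` that is subtracted here is what the separate shortcut connections would model instead.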
Deep Learning of Representations: Looking Forward
Deep learning research aims at discovering learning algorithms that discover
multiple levels of distributed representations, with higher levels representing
more abstract concepts. Although the study of deep learning has already led to
impressive theoretical results, learning algorithms and breakthrough
experiments, several challenges lie ahead. This paper proposes to examine some
of these challenges, centering on the questions of scaling deep learning
algorithms to much larger models and datasets, reducing optimization
difficulties due to ill-conditioning or local minima, designing more efficient
and powerful inference and sampling procedures, and learning to disentangle the
factors of variation underlying the observed data. It also proposes a few
forward-looking research directions aimed at overcoming these challenges.
Safe transient operation of microgrids based on master-slave configuration
The master-slave configuration is a suitable alternative
to the droop control method used in microgrids. In this configuration,
only one inverter is the master, while the others are slaves. The
slave inverters are always current controlled, whereas the master
inverter should have two selectable operation modes: current
controlled, when the microgrid is connected to the grid; and
voltage controlled, when it is operating in island mode. In
grid-connected mode, the master needs a synchronization system to
perform accurate control of its delivered power, and, in
island mode, it needs a voltage reference oscillator that serves
as a reference for the slave inverters. Based on the master-slave
concept, this paper proposes a single system that performs both
functions, i.e., it can act as a synchronization system or as
a voltage reference oscillator depending on an input selector.
Moreover, the system ensures a smooth transition between
the two operation modes, guaranteeing the safe operation of
the microgrid. Experimental results are provided to confirm the
effectiveness of the proposed system. Peer Reviewed
Scalable Neural Networks for Board Games
Learning to solve small instances of a problem should help in solving large instances. Unfortunately, most neural network architectures do not exhibit this form of scalability. Our Multi-Dimensional Recurrent LSTM Networks, however, show a high degree of scalability, as we empirically show in the domain of flexible-size board games. This allows them to be trained from scratch up to the level of human beginners, without using domain knowledge.
Towards Adjusting Mobile Devices to User’s Behaviour
Mobile devices are a special class of resource-constrained embedded devices. Computing power, memory, the available energy, and network bandwidth are often severely limited. These constrained resources require extensive optimization of a mobile system compared to larger systems. Any needless operation has to be avoided, and time-consuming operations have to be started early on. For instance, loading files ideally starts before the user wants to access the file. So-called prefetching strategies optimize a system's operation. Our goal is to adjust such strategies on the basis of logged system data. Optimization is then achieved by predicting an application's behavior based on facts learned from earlier runs on the same system. In this paper, we analyze system calls at the operating-system level and compare two paradigms, namely server-based and device-based learning. The results could be used to optimize the runtime behaviour of mobile devices.
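The learn-from-logged-calls idea can be sketched with a first-order (bigram) predictor: count which call tends to follow which in the log, then prefetch for the most likely successor of the current call. This is a minimal illustration, not the paper's method; the call names in the example log are hypothetical.

```python
from collections import Counter, defaultdict

class NextCallPredictor:
    """Bigram model over a logged system-call sequence: for each
    observed call, count its successors and predict the most
    frequent one. A prefetcher could act on predicted file
    accesses before the user requests them."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, log):
        for prev, nxt in zip(log, log[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, current):
        if not self.counts[current]:
            return None  # call never seen as a predecessor
        return self.counts[current].most_common(1)[0][0]

# Illustrative log of two identical application runs.
log = ["open:config", "read:config", "open:db", "read:db",
       "open:config", "read:config", "open:db", "read:db"]
p = NextCallPredictor()
p.train(log)
print(p.predict("open:config"))  # → read:config
```

Training such a model on the device itself versus shipping logs to a server is exactly the device-based/server-based trade-off the abstract compares: the model is tiny either way, but log transfer costs bandwidth and energy.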